Abstract

Fractal image compression, Culik's image compression and zerotree prediction coding of 
wavelet image decomposition coefficients succeed only because typical images being compressed 
possess a significant degree of self-similarity. This is a unifying, common concept of these 
seemingly dissimilar compression techniques, which may not be apparent due to particular 
terminologies each of the methods uses. Besides the common concept, these methods turn out to 
be even more tightly related, to the point of algorithmic reducibility of one technique to another. 
The goal of the present paper is to demonstrate these relations. 

The paper offers a plain-term interpretation of Culik's image compression, a very capable 
yet undeservedly underrepresented method giving spectacular results. Culik's method will be 
explained in regular image processing terms, without resorting to finite state machines and similar 
lofty language. The interpretation is shown to be algorithmically related to an IFS fractal image 
compression method: an IFS can be exactly transformed into Culik's image code. Using this 
transformation, we will prove that in a self-similar (part of an) image any zero wavelet coefficient 
is the root of a zerotree, or its branch. 

The paper discusses the zerotree coding of (wavelet/projection) coefficients as a common 
predictor/corrector, applied vertically through different layers of a multiresolutional 
decomposition, rather than within the same view. This interpretation leads to an insight into the 
evolution of image compression techniques: from a causal single-layer prediction, to non-causal 
same-view predictions (wavelet decomposition among others) and to a causal cross-layer 
prediction (zero-trees, Culik's method). A non-causal cross-level prediction appears to be the next 
step. Will someone take it? 

I. Introduction 

The present paper deals with analysis, generalizations and unifications of 
the latest group of powerful image compression techniques: fractal image 
compression with Iterated Function Systems (IFS) [BARN93], Culik's 
compression with finite automata [CULI95] and Shapiro's embedded coding of 
wavelet coefficients using zerotrees [SHAP94]. All three techniques achieve 
premium results by exploiting properties of self-similarity of typical images. In 
more precise terms, they all rely on the fact that parts of image representations at 
different resolutions may in some sense be similar. Therefore, a higher- 
resolution representation may be rather accurately predicted from a low- 
resolution one. This leads to compression due to compactness of the low- 
resolution view and smallness of the prediction errors (corrections). 

Although this conceptual unity is fairly obvious, details of the precise 
relationship among these methods are a bit obscure. This is partly because of 
specialized non-intersecting terminology domains used to describe these 
techniques: iterated transforms, finite automata, wavelet image transform. In the 
present paper, we will show that all three methods can be formulated in plain 
terms of a common language, which makes the kinship of these techniques 
manifest. Furthermore, it turns out that these methods are not only conceptually 
related, they are algorithmically reducible as well. The paper demonstrates an 
algorithm by which an image coding with one technique can be exactly 
transformed into another method's image code, with both codes yielding identical 
reconstruction results. Specifically, we will show how an IFS can be rendered in 
terms of projection matrices of Culik's method. Although these techniques appear 
to function in opposite ways (an IFS iteration shrinks the image iterated upon, 
while a Culik's iteration expands it), the reduction of one to the other is indeed 
possible, with both iterations producing identical results at all steps up to the final 
reconstructed image. This transformation also allows us to demonstrate in exact, 
precise terms how self-similarity of a part of an image gives rise to a zerotree of 
corresponding wavelet coefficients. In other words, if an image can be adequately 
represented by an IFS, every zero/insignificant wavelet coefficient in its 
decomposition is a root of a zerotree branch. 

The three methods above can be considered the latest step in the evolution of 
image compression techniques. Since every compressor is based on modelling 
(prediction) of a source and compact representation (or disregarding) of the 
prediction errors, what sets different algorithms apart is whether prediction is 
causal, and what quantity is predicted. For example, CCITT Group III, JPEG 
lossless, etc., use a causal prediction of a pixel from its same-resolution 
neighborhood. A Laplacian pyramid decomposition, perfected by a Wavelet 
image transform, is an example of a non-causal prediction, as first noted by Burt 
[BURT83]. There, the neighborhood surrounds the pixel in question on all flanks. 
This usually leads to a more accurate prediction (and, therefore, smaller 
correction). Then came zerotree coding, a causal prediction of a 
coefficient/pixel based upon its resolutional neighborhood. This cross-resolutional 
prediction is indeed causal: if a parent tree node is zero (insignificant), all child 
nodes are anticipated to be zeros as well. A non-causal cross-resolution predictor 
awaits: wavelet decomposition of layers of wavelet decomposition? 
II. Culik's method revealed: fat pixels and exposing projectors 

Culik's method is based on an alternative exact representation of an image 
as a single "fat" pixel, which gets stretched and smeared during repeated 
expansion operations, until it covers the whole area of the original picture. 
Unlike a regular, "thin", pixel (which holds a single value: brightness of the 
corresponding picture element), the fat pixel is a vector. The brightness of the 
corresponding picture element is computed as a linear combination of the fat 
pixel vector elements. In the simplest case, one can consider the first element of a 
fat pixel vector to be a "visible" brightness, with the rest of the vector values 
being "hidden". The hidden values show up during projection by four matrices, 
which arrange fat pixel(s) into four quadrants of a larger picture. This 
representation of an image by a single fat pixel is always possible, and the 
original image can be reconstructed in its entirety. As an example, the picture 
below shows a representation of a 4x4 image by a single 16-vector (fat pixel). 
Different pixels are numbered 1 through 16: these are merely pixel labels rather 
than actual pixel values. 
Fig. 1. Example of fat pixel revelations to precisely reconstruct an image. Far left: the original 
image. Far right: a single fat pixel with 16 components. Center: a partial revelation of hidden 
components (grayed) upon application of the four transformation matrices. 

There are only four projection matrices, C0-C3, which are applied over and over 
again to produce an image at a finer resolution. For example, applying transform 
C0 to the original fat pixel (Fig. 1, far right) makes the lower-left fat pixel of the 2x2 
square at the center of Fig. 1. Applying C0 again, to the entire 2x2 square, gives 
the lower-left quadrant of the image in Fig. 1, far left. The projection matrices 
in the example above are trivial: 



1000000000000000 
0000100000000000 
0000000010000000 
0000000000001000 

0100000000000000 
0000010000000000 
0000000001000000 
0000000000000100 

0010000000000000 
0000001000000000 
0000000000100000 
0000000000000010 

0001000000000000 
0000000100000000 
0000000000010000 
0000000000000001 

Fig. 2. Projection matrices for Fig. 1. 
Taken together, they obviously act as a permutation: one matrix picks up every fourth 
element of a vector, the next matrix picks the elements that follow them, etc. 
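
As a minimal sketch of the example above, the following Python fragment represents each projection as "keep every fourth component, starting at offset k", which is what the four matrices of Fig. 2 do to a 16-vector. The assignment of offsets to the names C0-C3, the function names, and the placement of quadrants in the printed array are assumptions made for illustration; only the fact that two expansion steps reveal all 16 hidden components is taken from the text.

```python
def project(fat_pixel, k):
    """Apply the k-th selection matrix: keep components k, k+4, k+8, ..."""
    return fat_pixel[k::4]


def expand(grid):
    """One expansion step: quadrant k of the result is the whole input grid
    with projection k applied to every fat pixel."""
    n = len(grid)
    out = [[None] * (2 * n) for _ in range(2 * n)]
    for k in range(4):
        # Assumed placement: k = 0, 1, 2, 3 -> (row offset, column offset).
        dr, dc = divmod(k, 2)
        for r in range(n):
            for c in range(n):
                out[dr * n + r][dc * n + c] = project(grid[r][c], k)
    return out


# A single fat pixel whose 16 components are the pixel labels 1..16.
grid = [[list(range(1, 17))]]
grid = expand(grid)   # 2x2 grid of 4-component fat pixels
grid = expand(grid)   # 4x4 grid of 1-component ("thin") pixels

# The visible brightness is the first component of each fat pixel.
image = [[pixel[0] for pixel in row] for row in grid]
for row in image:
    print(row)
```

After two steps every fat pixel has shrunk to a single component, so nothing remains hidden and each of the 16 labels appears exactly once in the image.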

The image representation above does not yet give any compression; 
moreover, we need additional space to store the coefficients of the projection 
matrices. However, it might turn out that the original image or its close 
approximation can be reconstructed with a less fat pixel, that is, a shorter vector. 
For example, consider a Sierpinski gasket: 




Fig. 3. Making of the Sierpinski gasket: two steps of expanding a thin pixel. 

As the figure shows, one needs only a single "thin" pixel and four 1x1 matrices 
C0=C1=C2=1, C3=0 to make the gasket at any resolution. Another example is a 
diagonal grayscale ramp (this example is almost identical to the one given in 
Culik's paper [CULI95]). The numbers in squares in the figure below are the 
pixel values themselves, on a 1-256 scale. 
Fig. 4a. Original fat pixel (a hidden component is grayed) 

Fig. 4b. One step of transformation 

Fig. 4c. Two steps of transformation (all hidden components have the value of 256 and are not shown) 



The projection matrices are as follows: 
(1) 



One can iterate further to obtain a bigger image, with smoother gray scale 
gradations. In any case, one needs only a single 2-vector (fat pixel) and four 2x2 
matrices, 18 short integers total, to represent even 256x256 and bigger images. 
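
The gasket construction of Fig. 3 can be sketched in a few lines of Python. With 1x1 matrices a projection is just a multiplication by a scalar, so the expansion step reduces to copying the current image into three quadrants and zeroing the fourth. The function name and the placement of the quadrants (C3 ends up in the lower-right corner of the printed array) are assumptions made for illustration.

```python
def expand_thin(image, c):
    """One expansion step for thin pixels: quadrant k is c[k] * image."""
    n = len(image)
    out = [[0] * (2 * n) for _ in range(2 * n)]
    for k in range(4):
        dr, dc = divmod(k, 2)   # assumed quadrant placement
        for r in range(n):
            for col in range(n):
                out[dr * n + r][dc * n + col] = c[k] * image[r][col]
    return out


C = [1, 1, 1, 0]        # C0 = C1 = C2 = 1, C3 = 0
image = [[1]]           # a single thin pixel
for _ in range(3):      # three steps give an 8x8 image
    image = expand_thin(image, C)

for row in image:
    print("".join("#" if v else "." for v in row))
```

The whole description of the picture is the starting pixel plus four numbers, regardless of how many expansion steps are taken.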

As Culik's presentations at DCC conferences have demonstrated, even 
realistic pictures (of Lenna, among others) can be represented quite compactly 
with only 300 or so coefficients total (as compared to 1/4M pixels in the case of a 
512x512 grayscale picture). 
III. Culik's Compression and Iterated Function Systems 



Examples of Culik's iterations shown above strongly suggest that Culik's 
method must be very closely related to iterated function systems. It is indeed: in 
this section, we will show how one can convert an IFS into Culik's transform/fat 
pixel. 

An Iterated Function System (IFS) is a finite collection of contraction 
mappings [BARN93]. In practice [BARN93, KOMI95], these mappings are 
usually specified as transformations between two partitionings of the same image 
into blocks. One, a finer scale partitioning into range blocks, is usually a regular 
tiling of the image into non-overlapping, usually 4x4 blocks. Another 
partitioning uses bigger blocks, called domain blocks, which can overlap and do 
not have to cover the whole picture. Usually domain blocks are twice as big as the 
range blocks. An IFS is made of separate transformations from a domain block to 
a range block. A single transformation squeezes the domain block and linearly 
adjusts its brightness. For example, the figure below depicts a very simple IFS 
with a single domain block and four smaller range blocks: 




Fig. 5. Sample IFS with a single domain block 



Note that the exact sizes of the blocks are irrelevant in this example. The only 
thing that matters is that the range blocks and the domain block both partition the 
same image, and that the range blocks are half as big in each dimension as 
the domain block. A linear transform of a block's brightness, αD+β, applies to all 
pixels of the domain block D. For example, starting with a square image with a 
uniform brightness (grayscale) value y, and applying the transformations above 
once, and then again, one obtains: 
Fig. 6a. Application of the IFS of Fig. 5 to a square image of uniform brightness y, considered as a 
single domain block 

Fig. 6b. Application of the IFS of Fig. 5 to the image of Fig. 6a, considered as a single domain 
block 



The result of one iteration is an image of the same size but with four times as 
many details. Once the size of a detail diminishes down to one pixel, one may stop 
iterating: for all practical purposes, "convergence" is achieved. It is obvious that 
the content of the starting image becomes less and less important, as it shrinks 
twofold at each iteration. Moreover, provided |α_k| < 1, all the series in Fig. 6b 
converge to a limit that does not depend on the initial value y. 

Precisely the same result can be obtained with Culik's transforms, with the 
original fat pixel i and projection matrices as follows: 



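\[ \mathbf{i} = \begin{pmatrix} y \\ 1 \end{pmatrix}, \qquad C_k = \begin{pmatrix} \alpha_k & \beta_k \\ 0 & 1 \end{pmatrix}, \qquad k = 0,1,2,3 \qquad (2) \]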



Indeed, applying Culik's projection once to the fat pixel i gives a picture 
exactly like Fig. 6a. The only difference is that each quadrant is now a pixel 
(rather than a square 'subimage'), and it is a fat pixel with a hidden value of 1. 
Applying the projection once again results in Fig. 6b, with the identical 
interpretation. In general, it is obvious that an IFS launched from a square image 
of size 2^m and Culik's transform give identical (and identically sized) results 
after m iterations each. 
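
The equivalence claimed above can be checked numerically. The sketch below assumes that the fat pixel is the 2-vector (y, 1)' and that each projection maps (x, 1)' to (α_k x + β_k, 1)', which is one way to encode the brightness transform αD+β with a hidden value of 1. The parameter values, the function names and the quadrant placement are illustrative assumptions, and the IFS shrinks a block by averaging 2x2 pixel groups, which is also an assumption.

```python
from fractions import Fraction

ALPHA = [Fraction(1, 2)] * 4                      # hypothetical contrast factors
BETA = [Fraction(b) for b in (128, 64, 64, 0)]    # hypothetical brightness shifts


def place(quadrants, n):
    """Assemble four n x n arrays into one 2n x 2n array (assumed layout)."""
    out = [[None] * (2 * n) for _ in range(2 * n)]
    for k, quad in enumerate(quadrants):
        dr, dc = divmod(k, 2)
        for r in range(n):
            for c in range(n):
                out[dr * n + r][dc * n + c] = quad[r][c]
    return out


def ifs_step(image):
    """Shrink the image twofold and paste alpha_k * D + beta_k into quadrant k."""
    n = len(image) // 2
    small = [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1] +
               image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
              for c in range(n)] for r in range(n)]
    quads = [[[ALPHA[k] * v + BETA[k] for v in row] for row in small]
             for k in range(4)]
    return place(quads, n)


def culik_step(grid):
    """Expand a grid of fat pixels (visible, hidden): quadrant k is C_k * grid."""
    n = len(grid)
    quads = [[[(ALPHA[k] * vis + BETA[k] * hid, hid) for vis, hid in row]
              for row in grid] for k in range(4)]
    return place(quads, n)


m, y = 3, Fraction(100)
image = [[y] * 2**m for _ in range(2**m)]   # uniform starting image for the IFS
grid = [[(y, Fraction(1))]]                 # single fat pixel for Culik's method
for _ in range(m):
    image = ifs_step(image)
    grid = culik_step(grid)

visible = [[vis for vis, _ in row] for row in grid]
print(visible == image)
```

Exact rational arithmetic is used so that the comparison is not blurred by rounding. The IFS works on a fixed-size image whose details get finer, while the grid of fat pixels grows from 1x1 to 2^m x 2^m.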



Note that Fig. 4 above is a particular case of this example, Fig. 6 and eq. (2), with 

\[ \alpha_k = 1/2, \quad \beta_0 = 128, \quad \beta_1 = \beta_2 = 64, \quad \beta_3 = 0, \quad y = 128 \qquad (3) \]

Let us consider now a more complex IFS, with several domain blocks. 
First, we will try an example with a single transform, a mapping between a 
domain and a range block: 
Fig. 7. Sample IFS with a single domain-to-range block mapping 
Iterating upon a square with a uniform brightness y yields, in turn: 
Fig. 8a. Application of the IFS of Fig. 7 to a square image of uniform brightness y, considered covered by the 4 domain blocks 

Fig. 8b. Application of the IFS of Fig. 7 to the image of Fig. 8a, considered covered by the 4 domain blocks. Only the lower right quadrant is shown. 

Fig. 8c. Application of the IFS of Fig. 7 to the image of Fig. 8b, considered covered by the 4 domain blocks. Only the lower right quadrant is shown. 

The corresponding Culik's transformation is: 

It is evident that the first iteration of projecting the starting fat pixel i leads to the 
picture on the right-hand side of Fig. 7; the second iteration results in Fig. 8a. 
Iterating further gives, in turn, Fig. 8b, Fig. 8c, etc. 

A more complex example involves two domain blocks and two "mutually 
dependent" transforms: 
Fig. 9. Sample IFS with a mutually-dependent domain-to-range block mapping 

which converges to something like 

Fig. 10. Third iteration of IFS Fig. 7 

The fat pixel i and the projection matrices corresponding to the example are as 
follows: 
where I is a 3x3 unit matrix and 0 is a 3x3 zero matrix. As one can easily verify 
by applying these projection matrices to the fat pixel i over and over again, the 
result at each iteration is identical to that for the IFS of Fig. 9. 

Thus the IFS and Culik's image compression methods are indeed very 
tightly related, despite outward differences: an IFS iteration shrinks and 
reshuffles input image tiles, while a Culik's iteration merely rearranges its input, 
always in the same regular way. Culik's method starts with a single domain 
block (that is, the entire image, or a "fat" pixel), and uses "range" blocks of the 
same size as the domain block itself. However, Culik's method makes up for 
the lost "translational" degrees of freedom by using "fat" pixels and a more 
complex luminance transform: still linear, but vector rather than scalar. 
Similar to IFS, Culik's matrices are required not to amplify pixels' luminance 
[CULI95]; i.e., the matrices should be contractive, or at least, not expanding. 
Finally, as the examples above show, an IFS can indeed be algorithmically reduced to 
a Culik's transform. The general algorithm of this reduction and its inverse are 
discussed in more detail in the paper. 

It is obvious from the examples above that an IFS with k transforms 
requires a 2(k+1)-element fat pixel and four 2(k+1)x2(k+1) projection matrices. 
Note that the bottom half of these matrices is just a unit matrix, which does not 
have to be stored at all. The upper halves are also very sparse, which ought to be 
taken advantage of. For example, one can regard a projection matrix as a distance 
matrix for a directed weighted graph. Since the matrix is sparse, the 
corresponding graph would have rather few edges, which can be more efficiently 
stored as a list. Thus we come to exactly the same weighted graphs Culik 
originally used to represent his finite automata [CULI95]. Hence, the automata 
described by Culik are nothing but a neat trick to efficiently store and use sparse 
projection matrices. 
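
A minimal sketch of this storage idea: each projection matrix is kept as a list of weighted edges (row, column, weight), one list per quadrant label, and a projection is applied by walking the edges. The 2x2 matrices used here follow the assumed form with rows (α, β) and (0, 1); the names and the numbers are hypothetical and are not taken from [CULI95].

```python
def to_edges(matrix):
    """List the nonzero entries of a dense matrix as (row, column, weight)."""
    return [(r, c, w)
            for r, row in enumerate(matrix)
            for c, w in enumerate(row) if w != 0]


def apply_edges(edges, vector):
    """Multiply the sparse matrix given by its edge list with a vector."""
    out = [0] * len(vector)   # the matrices are square
    for r, c, w in edges:
        out[r] += w * vector[c]
    return out


# Hypothetical projection matrices, one per quadrant label 0..3.
matrices = [
    [[0.5, 128], [0, 1]],
    [[0.5, 64], [0, 1]],
    [[0.5, 64], [0, 1]],
    [[0.5, 0], [0, 1]],
]
automaton = {k: to_edges(m) for k, m in enumerate(matrices)}

fat_pixel = [128, 1]
# Following the edges labelled 3 and then 0 computes C0 * (C3 * fat_pixel).
result = apply_edges(automaton[0], apply_edges(automaton[3], fat_pixel))
print(automaton[3], result)
```

Zero entries never reach the edge lists, so both the storage and the cost of one projection grow with the number of edges rather than with the square of the fat pixel size.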
IV. An IFS image has a zerotree of wavelet coefficients 

The title of the section is actually a formulation of a theorem the paper 
presents and proves. In precise terms, within an image or a part of it with a 
property of self-similarity, i.e., which can be adequately described/reproduced by 
an IFS, a zero wavelet coefficient is always a root of a zerotree branch. In other 
words, if a wavelet coefficient at some particular resolution turns out to be zero 
(exactly or within some tolerance), all the child coefficients, at finer resolutions, 
will be zero as well (exactly or within the same tolerance). Thus, as long as an 
image has enough self-similarity to allow efficient compression by an IFS (or, 
which is the same, by Culik's method), a zerotree coding of wavelet coefficients 
would be beneficial. This is the unifying idea mentioned above; the theorem gives 
it a more precise meaning. 

Because of space constraints, we will show the proof of the theorem on a 
small but characteristic example. We will analyze the case of a simple Haar wavelet 
transform, which has very short wavelet filters, spanning, in 2D, a cluster 
of four "pixels". Consider a set of zoomed-out views of a self-similar (part of a) 
picture, and assume that the top view is made of the 4-pixel cluster. Following the 
premise of self-similarity, all these views are well described by an IFS, or (which 
is the same, as we saw above) by a Culik's transform. Let the corresponding "fat" 
pixels of the cluster be F0, F1, F2, and F3, and the projection matrices C0-C3. 
Let us arrange the four-pixel neighborhood in a block-vector F=(F0 F1 F2 F3)'. 
Note that the fat pixels are vectors themselves, which is why we call F a 
block-vector. Finer resolution views of the cluster can be obtained by projecting it with 
(block) matrices Ci. For example, the lower-left quadrant of the cluster at a 
higher resolution can be computed as 
where C0 is a block-projection matrix. The pixels F0-F3 can be combined to 
yield a (fat) wavelet coefficient W, by using a 2D (high-pass) filter with 
coefficients [h0,h1,h2,h3]: 



\[ W = HF = \begin{pmatrix} h_0 I & h_1 I & h_2 I & h_3 I \end{pmatrix} F \qquad (7) \]



where I is a unit matrix of the same size as Fi. The pivotal point of the proof is 
the fact that matrix H commutes with Ci. This is easy to see by directly 
computing CiH and HCi, which in both cases gives: 
\[ C_i H = H C_i = \begin{pmatrix} h_0 C_i & h_1 C_i & h_2 C_i & h_3 C_i \end{pmatrix} \qquad (8) \]



A child wavelet coefficient at any finer resolution can be computed then as 

\[ W_{k_1 k_2 \ldots k_n} = H C_{k_1} C_{k_2} \cdots C_{k_n} F = C_{k_1} C_{k_2} \cdots C_{k_n} H F \qquad (9) \]

Note that the first element of a fat pixel Fi is the visible pixel. The hidden 
elements are either 1, or can be set to 1, because it does not matter in the case of 
contracting matrices Ci, as we saw above. Since the wavelet filter H is high-pass, 
(h0 h1 h2 h3)(1 1 1 1)' is exactly zero. Therefore, if the wavelet-filtering of 
visible pixels Fi gives zero as well, the entire fat wavelet coefficient W=HF is a 
zero vector. It follows then from eq. (9) that all the children wavelet coefficients 
are zeros as well. 

One can easily accommodate other wavelet filters by considering a larger 
neighborhood of pixels. Block-vector F and block-matrices Ci would have more 
block-rows/columns, but the derivations remain the same. It is also easy to 
generalize the result to a case when a wavelet coefficient is not exactly zero, but 
small. As long as matrices Ci are not expanding (that is, convergence is 
guaranteed), all child wavelet coefficients would be just as small as their parent. 
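
The argument can be traced on a small numeric case. The sketch below assumes 2-component fat pixels (visible value, hidden 1), projections of the form (x, 1)' to (α_k x + β_k, 1)', and the diagonal 2D Haar filter h = (1, -1, -1, 1)/2; all numbers and names are illustrative assumptions. It computes the children of a wavelet coefficient both directly from the projected cluster and as Ck W, then checks that a zero parent yields only zero descendants and that a small parent yields descendants that are no larger.

```python
from fractions import Fraction
from itertools import product

H = [Fraction(1, 2), Fraction(-1, 2), Fraction(-1, 2), Fraction(1, 2)]  # sums to 0
ALPHA = [Fraction(1, 2), Fraction(1, 4), Fraction(3, 4), Fraction(1, 2)]
BETA = [Fraction(b) for b in (128, 64, 64, 0)]


def project(k, v):
    """Apply C_k = [[alpha_k, beta_k], [0, 1]] to a 2-vector v."""
    return (ALPHA[k] * v[0] + BETA[k] * v[1], v[1])


def wavelet(cluster):
    """Fat wavelet coefficient W = sum_j h_j F_j of a four-pixel cluster."""
    return tuple(sum(h * f[i] for h, f in zip(H, cluster)) for i in range(2))


def descendants(cluster, depth):
    """Wavelet coefficients of all sub-clusters reached by `depth` projections."""
    result = []
    for path in product(range(4), repeat=depth):
        sub = cluster
        for k in path:
            sub = [project(k, f) for f in sub]
        result.append(wavelet(sub))
    return result


# Visible values chosen so that the filter output is zero: 10 - 20 - 30 + 40 = 0.
F = [(Fraction(v), Fraction(1)) for v in (10, 20, 30, 40)]
W = wavelet(F)

# Children computed from the projected cluster agree with C_k applied to W.
commutes = descendants(F, 1) == [project(k, W) for k in range(4)]
all_zero = all(w == (0, 0) for d in (1, 2, 3) for w in descendants(F, d))

# A cluster with a small nonzero coefficient: descendants do not grow.
G = [(Fraction(v), Fraction(1)) for v in (10, 20, 30, 44)]
parent = abs(wavelet(G)[0])
shrinks = all(abs(w[0]) <= parent for w in descendants(G, 2))
print(W, commutes, all_zero, shrinks)
```

The hidden components cancel in W because the filter coefficients sum to zero, which is why only the visible values decide whether the whole tree below W vanishes.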